PML Journal Club: Teemu Säilynoja
2023-03-28
Nathan Robertson, James M. Flegal, Dootika Vats, Galin L. Jones
Journal of Computational and Graphical Statistics 2021, Vol. 30, No. 2, 324–334
“Simultaneous estimation of means and quantiles has received little attention, despite being common practice.” Robertson et al. (2021)
Let \(\pi\) be a probability density with support \(\mathcal X \subseteq \mathbb R^d\) and \(X\sim \pi\).
Denote with \(\mathbf{m} \in \mathbb R^{p_1}\) and \(\mathbf{q} \in \mathbb R ^{p_2}\) the means and quantiles of interest.
Above,
\[ m_i = \mathbb E_\pi\left(g_i(X)\right) = \int_{\mathcal X}g_i(x)\pi(dx),\] for some \(g:\mathcal X \to \mathbb R^{p_1}\).
And
\[ q_i = F_{h_i}^{-1}(p_{q_i}) = \inf\left\lbrace v : F_{h_i}(v)\geq p_{q_i}\right\rbrace,\] with \(h:\mathcal X \to \mathbb R^{p_2}\), where \(V = h_i(X)\) is distributed according to \(F_{h_i}(v)\), which is absolutely continuous and has continuous density function \(f_{h_i}(v)\).
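As a minimal sketch of the two definitions above, the plug-in estimators from a sample of \(g_i(X_t)\) or \(h_i(X_t)\) values could look as follows. The function names `sample_mean` and `sample_quantile` are illustrative and not taken from the paper; the quantile uses the empirical CDF in place of \(F_{h_i}\) in the \(\inf\) definition.

```python
import math


def sample_mean(values):
    """Plug-in estimate of E[g_i(X)] from draws g_i(X_1), ..., g_i(X_n)."""
    return sum(values) / len(values)


def sample_quantile(values, prob):
    """Smallest observed v whose empirical CDF F_n(v) is at least prob.

    Mirrors q = inf{v : F(v) >= prob} with F replaced by F_n.
    """
    ordered = sorted(values)
    n = len(ordered)
    # F_n(ordered[k-1]) >= k/n, so we need the smallest k with k/n >= prob.
    # The small tolerance guards against floating-point noise in prob * n.
    k = max(1, math.ceil(prob * n - 1e-12))
    return ordered[k - 1]
```

For example, `sample_quantile([3, 1, 2, 4], 0.5)` picks the second order statistic, since \(F_n(2) = 0.5\).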
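Even if we find \(\Xi\in \mathbb R ^{p \times p}\), with \(p = p_1 + p_2\), s.t. the estimator \(\hat\nu_n\) of the stacked vector \(\nu = (\mathbf m, \mathbf q)\) satisfies
\[\sqrt n(\hat\nu_n - \nu) \xrightarrow{d} \mathcal N(0, \Xi), \quad \text{as $n \to \infty$,}\]
visualizing the elliptical confidence regions is difficult.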
Multivariate central limit theorem for any finite combination of sample means and quantiles under the assumption of a strongly mixing process.
Fast algorithm for constructing hyperrectangular confidence regions.
Let \(\left\lbrace X_t \right\rbrace\) be strictly stationary and strongly mixing.
Let \(A_h\) be a \(p_2 \times p_2\) diagonal matrix with \(A_{h}[i,i] = f_{h_i}(q_i)\) and define \[ \Lambda = \begin{bmatrix} I_{p_1 \times p_1} & 0_{p_1 \times p_2}\\ 0_{p_2 \times p_1} & A_h \end{bmatrix}. \]
If \(Y_j = \begin{bmatrix} g(X_j), I(h(X_j) > \mathbf q)\end{bmatrix}^T\), and
\[ \Sigma = \text{cov}\left(Y_1, Y_1\right) + \sum_{j=2}^\infty\left[\text{cov}\left(Y_1, Y_j\right) + \text{cov}\left(Y_1, Y_j\right)^T\right]\] is positive definite, then
\[\sqrt n (\hat\nu_n - \nu) \xrightarrow{d} \mathcal N(0, \Lambda^{-1}\Sigma\Lambda^{-1}), \quad \text{as $n \to \infty$.}\]
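To make the objects in the theorem concrete, here is a rough sketch of how \(Y_j\) could be assembled and \(\Sigma\) estimated from a single chain. The batch-means estimator is an assumption: it is one common choice for strongly mixing output, but these slides do not say which estimator is used. The names `make_y` and `batch_means_cov` are hypothetical, and estimating the densities in \(A_h\) is not covered.

```python
def make_y(x, g, h, q):
    """Stack g(x) with the indicators I(h_i(x) > q_i) into one vector Y."""
    indicators = [1.0 if h_i > q_i else 0.0 for h_i, q_i in zip(h(x), q)]
    return list(g(x)) + indicators


def batch_means_cov(ys, batch_size):
    """Batch-means estimate of Sigma from rows ys[j] = Y_j (each of length p).

    Assumption: non-overlapping batches; trailing rows that do not fill a
    complete batch are dropped.
    """
    p = len(ys[0])
    n_batches = len(ys) // batch_size
    n_used = n_batches * batch_size
    used = ys[:n_used]
    overall = [sum(row[i] for row in used) / n_used for i in range(p)]
    batch_avgs = []
    for b in range(n_batches):
        chunk = used[b * batch_size:(b + 1) * batch_size]
        batch_avgs.append([sum(row[i] for row in chunk) / batch_size
                           for i in range(p)])
    # Sample covariance of the batch averages, rescaled by the batch size.
    scale = batch_size / (n_batches - 1)
    return [[scale * sum((avg[i] - overall[i]) * (avg[k] - overall[k])
                         for avg in batch_avgs)
             for k in range(p)]
            for i in range(p)]
```

The batch size controls the usual bias-variance trade-off; a value near \(\sqrt n\) is a common default, though that choice is again an assumption here.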
Lower bound: coverage at most \(1-\alpha\)
Upper bound: coverage at least \(1-\alpha\)
Upper and lower \(p\)-dimensional confidence regions for \(\nu = \begin{bmatrix}\mathbf m \\ \mathbf q \end{bmatrix} \in \mathbb{R}^{p_1 + p_2}\), where \(p = p_1 + p_2\).
Let \(\hat\Lambda^{-1}\hat\Sigma\hat\Lambda^{-1}\) be a strongly consistent estimator of \(\Lambda^{-1}\Sigma\Lambda^{-1}\).
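\[\begin{align} C_{SI}(z) :=& \prod_{i=1}^{p_1}\left[\bar m_{i} - z\frac{\hat\sigma_i}{\sqrt n}, \bar m_{i} + z\frac{\hat\sigma_i}{\sqrt n}\right] \times \prod_{j=1}^{p_2}\left[\hat q_{j} - z\frac{\hat\gamma_{j}}{\sqrt n}, \hat q _{j} + z\frac{\hat\gamma_{j}}{\sqrt n}\right], \end{align}\]
where \(\hat\sigma_i^2\) and \(\hat\gamma_j^2\) are the diagonal elements of \(\hat\Lambda^{-1}\hat\Sigma\hat\Lambda^{-1}\) belonging to \(m_i\) and \(q_j\); in particular, \(\hat\gamma_j^2\) is the \(j\)th diagonal element of \(\hat A_h^{-1}\hat\Sigma_h\hat A_h^{-1}\).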
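A small sketch of how the hyperrectangle \(C_{SI}(z)\) could be assembled once the point estimates and asymptotic standard deviations are available. It assumes the \(\sqrt n\) scaling implied by the central limit theorem for \(\sqrt n(\hat\nu_n - \nu)\); the function name `simultaneous_intervals` is illustrative.

```python
import math


def simultaneous_intervals(estimates, std_devs, z, n):
    """Return one (lower, upper) pair per component of nu.

    estimates: sample means followed by sample quantiles.
    std_devs:  matching asymptotic standard deviations (sigma_i, gamma_j).
    z:         common critical value shared by all components.
    n:         number of draws used for the estimates.
    """
    half_widths = [z * sd / math.sqrt(n) for sd in std_devs]
    return [(est - hw, est + hw) for est, hw in zip(estimates, half_widths)]
```

Every component shares the same \(z\), so the region is fully described by a single scalar, which is what reduces the later search to one dimension.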
\(C_{LB} = C_{SI}\left(\Phi^{-1}\left(1-\frac\alpha 2\right)\right)\) has coverage of at most \(1-\alpha\).
\(C_{UB} = C_{SI}\left(\Phi^{-1}\left(1-\frac\alpha {2p}\right)\right)\) has coverage of at least \(1-\alpha\).
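The two critical values are cheap to compute. The sketch below uses only the Python standard library; `z_bounds` is an illustrative name. The lower value is the usual uncorrected marginal quantile and the upper value is the Bonferroni-corrected one.

```python
from statistics import NormalDist


def z_bounds(alpha, p):
    """Critical values defining C_LB (uncorrected) and C_UB (Bonferroni)."""
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    z_lower = std_normal.inv_cdf(1 - alpha / 2)
    z_upper = std_normal.inv_cdf(1 - alpha / (2 * p))
    return z_lower, z_upper
```

With \(p = 1\) the two values coincide, and the gap widens as \(p\) grows.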
Find \(C_{\alpha} = C_{SI}(z_\alpha)\) with coverage \(1 - \alpha\) s.t. \(C_{LB}\subseteq C_{\alpha} \subseteq C_{UB}\)
1-D optimization task w.r.t. \(z\in \left[\Phi^{-1}\left(1-\frac\alpha 2\right), \Phi^{-1}\left(1-\frac\alpha {2p}\right)\right]\)
Use quasi-Monte Carlo methods to evaluate the coverage level w.r.t. the multivariate normal distribution.
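The one-dimensional search can be sketched as a bisection over \(z\). Two simplifications are assumptions of this sketch and not the paper's procedure: plain Monte Carlo with a fixed seed replaces the quasi-Monte Carlo integration, and the same draws are reused at every \(z\) so that the estimated coverage is monotone in \(z\). The names `cholesky`, `max_abs_standardized` and `find_z` are illustrative; `cov` stands for an estimate of \(\Lambda^{-1}\Sigma\Lambda^{-1}\).

```python
import math
import random
from statistics import NormalDist


def cholesky(mat):
    """Lower-triangular factor L with L L^T = mat (mat symmetric, PSD)."""
    p = len(mat)
    low = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for k in range(i + 1):
            partial = sum(low[i][m] * low[k][m] for m in range(k))
            if i == k:
                low[i][k] = math.sqrt(mat[i][i] - partial)
            else:
                low[i][k] = (mat[i][k] - partial) / low[k][k]
    return low


def max_abs_standardized(cov, n_draws, seed=0):
    """Draw Z ~ N(0, cov) and return max_i |Z_i| / sd_i for each draw."""
    rng = random.Random(seed)
    p = len(cov)
    low = cholesky(cov)
    sds = [math.sqrt(cov[i][i]) for i in range(p)]
    stats = []
    for _ in range(n_draws):
        eps = [rng.gauss(0.0, 1.0) for _ in range(p)]
        draw = [sum(low[i][k] * eps[k] for k in range(i + 1))
                for i in range(p)]
        stats.append(max(abs(draw[i]) / sds[i] for i in range(p)))
    return stats


def find_z(cov, alpha, n_draws=20000, tol=1e-4, seed=0):
    """Bisect for z in [z_LB, z_UB] with estimated coverage about 1 - alpha."""
    p = len(cov)
    std_normal = NormalDist()
    lo = std_normal.inv_cdf(1 - alpha / 2)        # defines C_LB
    hi = std_normal.inv_cdf(1 - alpha / (2 * p))  # defines C_UB
    stats = max_abs_standardized(cov, n_draws, seed)

    def coverage(z):
        # A draw lies inside the hyperrectangle iff its max statistic <= z.
        return sum(s <= z for s in stats) / n_draws

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if coverage(mid) < 1 - alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Strongly correlated components pull the result toward the lower end of the interval, while nearly independent components push it toward the Bonferroni end.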